Papers Simplified: »Anticorrelated Noise Injection for Improved Generalization«
In this article, I will not explain all of the (exciting!) details of the paper. Instead, I will provide some implementations and pictures that should make it possible to understand the gist of it. I also did my best to implement the optimizers mentioned in the paper, but use the code with care because I'm not an expert in this area.

To understand what Anti-PGD (Anti-Perturbed Gradient Descent) is about, let us briefly recap how gradient descent (GD) and derived algorithms such as SGD and PGD work. Let us assume that we want to minimize a function f whose gradient is denoted by ∇f(θ).
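To make this setup concrete, here is a minimal sketch of plain gradient descent in Python. The toy objective f(θ) = (θ − 3)², the starting point, the learning rate, and the names `f`, `grad_f`, and `gradient_descent` are all assumptions chosen for illustration; none of them come from the paper.

```python
def f(theta):
    """Toy objective (an assumption for illustration): minimum at theta = 3."""
    return (theta - 3.0) ** 2


def grad_f(theta):
    """Gradient of the toy objective above."""
    return 2.0 * (theta - 3.0)


def gradient_descent(grad, theta0, learning_rate=0.1, n_steps=100):
    """Plain GD: repeatedly take a small step against the gradient."""
    theta = theta0
    for _ in range(n_steps):
        theta = theta - learning_rate * grad(theta)
    return theta


theta_hat = gradient_descent(grad_f, theta0=0.0)
print(theta_hat, f(theta_hat))
```

On this toy problem the iterate moves toward θ = 3, the minimizer of f. The variants discussed in this article change how each step is computed, but they keep this basic loop.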